Mean or Arithmetic Mean \(\bar{x}\), Geometric Mean \(\operatorname{GM}(x)\), Harmonic Mean \(\operatorname{HM}(x)\), Median \(\operatorname{median}(x)\), and Mode \(\operatorname{mode}(x)\) are some measures of central tendency of a sample.
Range \(\operatorname{range}(x)\), Semi-Interquartile Range \(\operatorname{SIR}(x)\), Mean Deviation about \(x'\) \(\operatorname{MD}_{(x')}(x)\), Variance \(s_x^2\), and Standard Deviation \(s_x\) are some measures of dispersion of a sample.
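As a rough illustration of the measures listed above, the following sketch computes each of them with Python's standard `statistics` module. The sample values and the helper `mean_deviation` are hypothetical, chosen only for illustration. Quartile conventions differ between textbooks and libraries, so the Semi-Interquartile Range shown here depends on the default (`exclusive`) method of `statistics.quantiles`.

```python
import statistics as st

# Hypothetical sample, chosen only for illustration.
x = [2.0, 4.0, 4.0, 8.0]

# Measures of central tendency
mean = st.mean(x)              # arithmetic mean
gm = st.geometric_mean(x)      # geometric mean (positive data only)
hm = st.harmonic_mean(x)       # harmonic mean (positive data only)
median = st.median(x)
mode = st.mode(x)


def mean_deviation(values, about):
    """Mean absolute deviation of the values about a chosen point x'."""
    return sum(abs(v - about) for v in values) / len(values)


# Measures of dispersion
value_range = max(x) - min(x)
q1, _q2, q3 = st.quantiles(x, n=4)   # quartiles; the convention is an assumption
sir = (q3 - q1) / 2                  # semi-interquartile range
md_mean = mean_deviation(x, mean)    # mean deviation about the mean
variance = st.variance(x)            # sample variance, divides by n - 1
std = st.stdev(x)                    # sample standard deviation

print(mean, gm, hm, median, mode)
print(value_range, sir, md_mean, variance, std)
```

Passing `median` instead of `mean` as the second argument of `mean_deviation` gives the mean deviation about the median.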
Covariance \(\operatorname{cov}(x, y)\) is a measure of the joint variability of two random variables \(x\) and \(y\).
Correlation is any statistical relationship, causal or spurious, between two random variables \(x\) and \(y\). Pearson’s correlation coefficient \(r\) measures the strength and direction of the linear relationship between them.
When the linear correlation is strong, it makes sense to look for a line of best fit.
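A minimal sketch of both quantities, assuming the sample (divide by \(n-1\)) definition of covariance and writing Pearson's \(r\) as the covariance divided by the product of the two standard deviations. The function names and data are hypothetical.

```python
import math


def covariance(x, y):
    """Sample covariance of paired observations (divides by n - 1)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    return sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / (n - 1)


def pearson_r(x, y):
    """Pearson's r: covariance scaled by both standard deviations."""
    # covariance(x, x) is the sample variance of x.
    return covariance(x, y) / math.sqrt(covariance(x, x) * covariance(y, y))


# Hypothetical paired sample, chosen only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

print(covariance(x, y))  # 1.5
print(pearson_r(x, y))   # about 0.775
```

Because \(r\) is scaled by both standard deviations, it always lies between -1 and 1, whereas the covariance changes with the units of \(x\) and \(y\).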
Linear Regression
Simple (univariate) Linear Regression is a method for estimating the relationship \(y_i=f(x_i)\) between a response variable \(y\) and a single predictor variable \(x\), as a straight line that closely fits the scatter plot of \(y\) vs. \(x\).
\[
y_i = \hat{a} + \hat{b} x_i + e_i
\]
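where \(\hat{a}\) is the intercept, \(\hat{b}\) is the slope, and \(e_i\) is the \(i\)th residual error, that is, the difference between the observed \(y_i\) and the value \(\hat{a} + \hat{b} x_i\) predicted by the line. For a better fit, we aim to make the residuals collectively as small as possible.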
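To make the residuals concrete, the sketch below evaluates \(e_i = y_i - (\hat{a} + \hat{b} x_i)\) for one candidate line. The data, the candidate intercept and slope, and the function name `residuals` are all hypothetical. Note that there is one residual per observation, and positive and negative residuals can cancel in a plain sum even when the line misses most points.

```python
def residuals(x, y, a, b):
    """Residuals e_i = y_i - (a + b * x_i) for a candidate line."""
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


# Hypothetical paired sample and candidate line, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
a, b = 1.0, 1.0

e = residuals(x, y, a, b)
print(e)                         # [0.0, 1.0, 1.0, -1.0, -1.0]
print(sum(e))                    # 0.0, although four points are off the line
print(sum(ei ** 2 for ei in e))  # 4.0
```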
Ordinary Least Squares
The Ordinary Least Squares method chooses \(\hat{a}\) and \(\hat{b}\) so as to minimize the error sum of squares \(\sum{e_i^2}\), which keeps the residuals \(e_i\) small overall.
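For a single predictor, this minimization has a well-known closed-form solution, which the text above does not state: \(\hat{b} = \sum{(x_i-\bar{x})(y_i-\bar{y})} / \sum{(x_i-\bar{x})^2}\) and \(\hat{a} = \bar{y} - \hat{b}\bar{x}\). The following is a minimal sketch of it, with hypothetical data and hypothetical function names `ols_fit` and `sse`.

```python
def ols_fit(x, y):
    """Closed-form least-squares intercept and slope for one predictor."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b = s_xy / s_xx        # slope
    a = y_bar - b * x_bar  # intercept: the line passes through (x_bar, y_bar)
    return a, b


def sse(x, y, a, b):
    """Error sum of squares for the line y = a + b * x."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))


# Hypothetical paired sample, chosen only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

a_hat, b_hat = ols_fit(x, y)
print(a_hat, b_hat)            # approximately 2.2 and 0.6
print(sse(x, y, a_hat, b_hat)) # approximately 2.4

# Nudging either coefficient away from the fit increases the sum of squares.
print(sse(x, y, a_hat + 0.1, b_hat) > sse(x, y, a_hat, b_hat))  # True
print(sse(x, y, a_hat, b_hat - 0.1) > sse(x, y, a_hat, b_hat))  # True
```

The slope is the sample covariance divided by the sample variance of \(x\), since the \(n-1\) factors cancel in the ratio.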